Abstract. A grounding of a formula $\phi$ over a given finite domain is a ground formula which is
equivalent to $\phi$ on that domain. Very effective propositional solvers have made grounding-based
methods for problem solving increasingly important; however, for realistic problem domains and
instances, the size of groundings is often problematic. A key technique in ground (e.g., SAT) solvers
is unit propagation, which often significantly reduces ground formula size even before search begins.
We define a "lifted" version of unit propagation which may be carried out prior to grounding, and
describe integration of the resulting technique into grounding algorithms. We describe an
implementation of the method in a bottom-up grounder, and an experimental study of its performance.

1 Introduction 

Grounding is central in many systems for solving combinatorial problems based on declarative
specifications. In grounding-based systems, a "grounder" combines a problem specification with a
problem instance to produce a ground formula which represents the solutions for the instance. A
solution (if there is one) is obtained by sending this formula to a "ground solver", such as a SAT
solver or propositional answer set programming (ASP) solver. Many systems have specifications given in
extensions or restrictions of classical first order logic (FO), including IDP [WMD08c], MXG [Moh04],
Enfragmo [ATU+10, AWTM11], ASPPS [ET06], and Kodkod [TJ07]. Specifications for ASP systems, such as
DLV [LPF+06] and clingo [GKK+08], are (extended) normal logic programs under stable model semantics.

Here our focus is grounding specifications in the form of FO formulas. In this setting, formula $\phi$
constitutes a specification of a problem (e.g., graph 3-colouring), and a problem instance is a finite
structure $\mathcal{A}$ (e.g., a graph). The grounder, roughly, must produce a ground formula $\psi$
which is logically equivalent to $\phi$ over the domain of $\mathcal{A}$. Then $\psi$ can be
transformed into a propositional CNF formula, and given as input to a SAT solver. If a satisfying
assignment is found, a solution for $\mathcal{A}$ can be constructed from it. ASP systems use an
analogous process.

A "naive" grounding of $\phi$ over a finite domain $A$ can be obtained by replacing each sub-formula of
the form $\exists x\, \psi(x)$ with $\bigvee_{a \in A} \psi(\tilde{a})$, where $\tilde{a}$ is a constant
symbol which denotes domain element $a$, and similarly replacing each subformula
$\forall x\, \psi(x)$ with a conjunction. For a fixed FO formula $\phi$, this can be done in time
polynomial in $|A|$. Most grounders use refinements of this method, implemented top-down or
bottom-up, and perform well on simple benchmark problems and small instances. However, as we tackle
more realistic problems with complex specifications and instances having large domains, the groundings
produced can become prohibitively large. This can be the case even when the formulas are "not too hard".
That is, the system performance is poor because of time spent generating and manipulating this large
ground formula, yet an essentially equivalent but smaller formula can be solved in reasonable time.
This work represents one direction in our group's efforts to develop techniques which scale effectively
to complex specifications and large instances.

Most SAT solvers begin by executing unit propagation (UP) on the input formula (perhaps with other 
"pre-processing"). This initial application of UP often eliminates a large number of variables and clauses, 
and is done very fast. However, it may be too late: the system has already spent a good deal of time 
generating large but rather uninteresting (parts of) ground formulas, transforming them to CNF, moving 
them from the grounder to the SAT solver, building the SAT solver's data structures, etc. This suggests 
trying to execute a process similar to UP before or during grounding. 

One version of this idea was introduced in [WMD08b, WMD10]. The method presented there involves
computing a symbolic and incomplete representation of the information that UP could derive, obtained
from $\phi$ alone without reference to a particular instance structure. For brevity, we refer to that
method as GWB, for "Grounding with Bounds". In [WMD08b, WMD10], the top-down grounder GidL [WMD08a]
is modified to use this information, and experiments indicate it significantly reduces the size of
groundings without taking unreasonable time.

* This author's contributions to this paper were made while he was a post-doctoral fellow at SFU.

An alternate approach is to construct a concrete and complete representation of the information that UP
can derive about a grounding of $\phi$ over $\mathcal{A}$, and use this information during grounding to
reduce grounding size. This paper presents such a method, which we call lifted unit propagation (LUP).
(The authors of the GWB papers considered this approach also [DW08], but to our knowledge did not
implement it or report on it. The relationship between GWB and LUP is discussed further in Section 2.)
The LUP method is roughly as follows.

1. Modify instance structure $\mathcal{A}$ to produce a new (partial) structure which contains
information equivalent to that derived by executing UP on the CNF formula obtained from a grounding of
$\phi$ over $\mathcal{A}$. We call this new partial structure the LUP structure for $\phi$ and
$\mathcal{A}$, denoted $\mathcal{LUP}(\phi, \mathcal{A})$.

2. Run a modified (top-down or bottom-up) grounding algorithm which takes as input $\phi$ and
$\mathcal{LUP}(\phi, \mathcal{A})$, and produces a grounding of $\phi$ over $\mathcal{A}$.

The modification in step 2 relies on the idea that a tuple in $\mathcal{LUP}(\phi, \mathcal{A})$
indicates that a particular subformula has the same (known) truth value in every model. Thus, that
subformula may be replaced with its truth value. The CNF formula obtained by grounding over
$\mathcal{LUP}(\phi, \mathcal{A})$ is at most as large as the formula that results from producing the
naive grounding and then executing UP on it. Sometimes it is much smaller than this, because the
grounding method naturally eliminates some autark sub-formulas which UP does not eliminate, as
explained in Sections 3 and 6.

We compute the LUP structure by constructing, from $\phi$, an inductive definition of the relations of
the LUP structure for $\phi$ and $\mathcal{A}$ (see Section 4). We implemented a semi-naive method for
evaluating this inductive definition, based on relational algebra, within our grounder Enfragmo. (We
also computed these definitions using the ASP grounders gringo and DLV, but these were not faster.)

For top-down grounding (see Section 3), we modify the naive recursive algorithm to check the derived
information in $\mathcal{LUP}(\phi, \mathcal{A})$ at the time of instantiating each sub-formula of
$\phi$. This algorithm is presented primarily for expository purposes, and is similar to the modified
top-down algorithm used for GWB in GidL.

For bottom-up grounding (see Section 5), we revise the bottom-up grounding method based on extended
relational algebra described in [MTHM06, PLTG07], which is the basis of grounders our group has been
developing. The change required to ground using $\mathcal{LUP}(\phi, \mathcal{A})$ is a simple revision
to the base case.

In Section 6 we present an experimental evaluation of the performance of our grounder Enfragmo with
LUP. This evaluation is limited by the fact that our LUP implementation does not support specifications
with arithmetic or aggregates, and by a shortage of interesting benchmarks which have natural
specifications without these features. Within the limited domains we have tested to date, we found:

1. CNF formulas produced by Enfragmo with LUP are always smaller than the result of running UP on
the CNF formula produced by Enfragmo without LUP, and in some cases much smaller.

2. CNF formulas produced by Enfragmo with LUP are always smaller than the ground formulas produced
by GidL, with or without GWB turned on.

3. Grounding over $\mathcal{LUP}(\phi, \mathcal{A})$ is always slower than grounding without, but CNF
transformation with LUP is almost always faster than without.

4. Total solving time for Enfragmo with LUP is sometimes significantly less than that of Enfragmo
without LUP, but in other cases is somewhat greater.

5. Enfragmo with LUP and the SAT solver MiniSat always runs faster than the IDP system (GidL with
ground solver MiniSat(ID)), with or without the GWB method turned on in GidL.

Determining the extent to which these observations generalize is future work.

2 FO Model Expansion and Grounding 

A natural formalization of combinatorial search problems and their specifications is as the logical
task of model expansion (MX) [MT11]. Here, we define MX for the special case of FO. Recall that a
structure $\mathcal{B}$ for vocabulary $\sigma \cup \varepsilon$ is an expansion of $\sigma$-structure
$\mathcal{A}$ iff $\mathcal{A}$ and $\mathcal{B}$ have the same domain ($A = B$), and interpret their
common vocabulary identically, i.e., for each symbol $R$ of $\sigma$,
$R^{\mathcal{B}} = R^{\mathcal{A}}$. Also, if $\mathcal{B}$ is an expansion of $\sigma$-structure
$\mathcal{A}$, then $\mathcal{A}$ is the reduct of $\mathcal{B}$ defined by $\sigma$.

Definition 1 (Model Expansion for FO).

Given: An FO formula $\phi$ on vocabulary $\sigma \cup \varepsilon$ and a $\sigma$-structure $\mathcal{A}$.
Find: An expansion $\mathcal{B}$ of $\mathcal{A}$ that satisfies $\phi$.

In the present context, the formula $\phi$ constitutes a problem specification, the structure
$\mathcal{A}$ a problem instance, and expansions of $\mathcal{A}$ which satisfy $\phi$ are solutions
for $\mathcal{A}$. Thus, we call the vocabulary of $\mathcal{A}$ the instance vocabulary, denoted by
$\sigma$, and $\varepsilon$ the expansion vocabulary. We sometimes say $\phi$ is
$\mathcal{A}$-satisfiable if there exists an expansion $\mathcal{B}$ of $\mathcal{A}$ that satisfies
$\phi$.

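Example 1. Consider the following formula $\phi$:

$$
\begin{aligned}
&\forall x\, [(R(x) \lor B(x) \lor G(x)) \land \neg(R(x) \land B(x)) \land \neg(R(x) \land G(x)) \land \neg(B(x) \land G(x))] \\
&\land\ \forall x\, \forall y\, [E(x,y) \supset (\neg(R(x) \land R(y)) \land \neg(B(x) \land B(y)) \land \neg(G(x) \land G(y)))].
\end{aligned}
$$

A finite structure $\mathcal{A}$ over vocabulary $\sigma = \{E\}$, where $E$ is a binary relation
symbol, is a graph. Given graph $\mathcal{A} = \mathcal{G} = (V; E)$, there is an expansion
$\mathcal{B}$ of $\mathcal{A}$ that satisfies $\phi$ iff $\mathcal{G}$ is 3-colourable. So $\phi$
constitutes a specification of the problem of graph 3-colouring. To illustrate: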

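$$\mathcal{B} = (\underbrace{V;\ E^{\mathcal{A}}}_{\mathcal{A}},\ R^{\mathcal{B}}, B^{\mathcal{B}}, G^{\mathcal{B}})$$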

An interpretation for the expansion vocabulary $\varepsilon := \{R, B, G\}$ given by structure
$\mathcal{B}$ is a colouring of $\mathcal{G}$, and the proper 3-colourings of $\mathcal{G}$ are the
interpretations of $\varepsilon$ in structures $\mathcal{B}$ that satisfy $\phi$.
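
To make the model expansion task of Example 1 concrete, the following is a minimal brute-force sketch. It is illustrative only and is not how a grounder-based system solves the problem; the function names `satisfies_phi` and `model_expand` and the set-based encoding of structures are assumptions made for this example.

```python
from itertools import product

def satisfies_phi(V, E, R, B, G):
    """Evaluate the formula of Example 1 in the expansion (V; E, R, B, G)."""
    one_colour_each = all(
        (x in R or x in B or x in G)
        and not (x in R and x in B)
        and not (x in R and x in G)
        and not (x in B and x in G)
        for x in V
    )
    no_monochrome_edge = all(
        not (x in R and y in R)
        and not (x in B and y in B)
        and not (x in G and y in G)
        for (x, y) in E
    )
    return one_colour_each and no_monochrome_edge

def model_expand(V, E):
    """Search for an expansion of (V; E) satisfying phi; return (R, B, G) or None.

    Only partitions of V are enumerated, since the first conjunct of phi
    forces every vertex to have exactly one colour.
    """
    V = sorted(V)
    for colours in product("RBG", repeat=len(V)):
        R = {v for v, c in zip(V, colours) if c == "R"}
        B = {v for v, c in zip(V, colours) if c == "B"}
        G = {v for v, c in zip(V, colours) if c == "G"}
        if satisfies_phi(V, E, R, B, G):
            return R, B, G
    return None
```

A triangle has such an expansion, while the complete graph on four vertices does not, so `model_expand` returns `None` for it.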

2.1 Grounding for Model Expansion 

Given $\phi$ and $\mathcal{A}$, we want to produce a CNF formula (for input to a SAT solver), which
represents the solutions for $\mathcal{A}$. We do this in two steps: grounding, followed by
transformation to CNF. The grounding step produces a ground formula $\psi$ which is equivalent to
$\phi$ over expansions of $\mathcal{A}$. To produce $\psi$, we bring domain elements into the syntax by
expanding the vocabulary with a new constant symbol for each domain element. For $A$, the domain of
$\mathcal{A}$, we denote this set of constants by $\tilde{A}$. For each $a \in A$, we write $\tilde{a}$
for the corresponding symbol in $\tilde{A}$. We also write $\tilde{\bar{a}}$, where $\bar{a}$ is a tuple.

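Definition 2 (Grounding of $\phi$ over $\mathcal{A}$). Let $\phi$ be a formula of vocabulary
$\sigma \cup \varepsilon$, $\mathcal{A}$ be a finite $\sigma$-structure, and $\psi$ be a ground formula
of vocabulary $\rho$, where $\rho \supseteq \sigma \cup \varepsilon \cup \tilde{A}$. Then $\psi$ is a
grounding of $\phi$ over $\mathcal{A}$ if and only if:

1. if $\phi$ is $\mathcal{A}$-satisfiable then $\psi$ is $\mathcal{A}$-satisfiable;

2. if $\mathcal{B}$ is a $\rho$-structure which is an expansion of $\mathcal{A}$ and gives $\tilde{A}$
the intended interpretation, and $\mathcal{B} \models \psi$, then $\mathcal{B} \models \phi$.

We call $\psi$ a reduced grounding if it contains no symbols of the instance vocabulary $\sigma$.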

Definition 2 is a slight generalization of that used in [MTHM06, PLTG07], in that it allows $\psi$ to
have vocabulary symbols not in $\sigma \cup \varepsilon \cup \tilde{A}$. This generalization allows us
to apply a Tseitin-style CNF transformation in such a way that the resulting CNF formula is still a
grounding of $\phi$ over $\mathcal{A}$. If $\mathcal{B}$ is an expansion of $\mathcal{A}$ satisfying
$\psi$, then the reduct of $\mathcal{B}$ defined by $\sigma \cup \varepsilon$ is an expansion of
$\mathcal{A}$ that satisfies $\phi$. For the remainder of the paper, we assume that $\phi$ is in
negation normal form (NNF), i.e., negations are applied only to atoms. Any formula may be transformed
in linear time to an equivalent formula in NNF.

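Algorithm 1 produces the "naive grounding" of $\phi$ over $\mathcal{A}$ mentioned in the introduction.
A substitution is a set of pairs $(x/a)$, where $x$ is a variable and $a$ a constant symbol. If
$\theta$ is a substitution, then $\phi[\theta]$ denotes the result of substituting constant symbol $a$
for each free occurrence of variable $x$ in $\phi$, for every $(x/a)$ in $\theta$. We allow conjunction
and disjunction to be connectives of arbitrary arity. That is, $(\land\ \phi_1\ \phi_2\ \phi_3)$ is a
formula, not just an abbreviation for some parenthesization of $(\phi_1 \land \phi_2 \land \phi_3)$.
The initial call to Algorithm 1 is $\mathrm{NaiveGnd}_{\mathcal{A}}(\phi, \emptyset)$, where
$\emptyset$ is the empty substitution.

Algorithm 1 Top-Down Naive Grounding of NNF formula $\phi$ over $\mathcal{A}$

$$
\mathrm{NaiveGnd}_{\mathcal{A}}(\phi, \theta) =
\begin{cases}
P(\bar{x})[\theta] & \text{if } \phi \text{ is an atom } P(\bar{x}) \\
\neg P(\bar{x})[\theta] & \text{if } \phi \text{ is a negated atom } \neg P(\bar{x}) \\
\bigwedge_i \mathrm{NaiveGnd}_{\mathcal{A}}(\psi_i, \theta) & \text{if } \phi = \bigwedge_i \psi_i \\
\bigvee_i \mathrm{NaiveGnd}_{\mathcal{A}}(\psi_i, \theta) & \text{if } \phi = \bigvee_i \psi_i \\
\bigwedge_{a \in A} \mathrm{NaiveGnd}_{\mathcal{A}}(\psi, [\theta \cup (x/\tilde{a})]) & \text{if } \phi = \forall x\, \psi \\
\bigvee_{a \in A} \mathrm{NaiveGnd}_{\mathcal{A}}(\psi, [\theta \cup (x/\tilde{a})]) & \text{if } \phi = \exists x\, \psi
\end{cases}
$$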

The ground formula produced by Algorithm 1 is not a grounding of $\phi$ over $\mathcal{A}$ (according
to Definition 2), because it does not take into account the interpretations of $\sigma$ given by
$\mathcal{A}$. To produce a grounding of $\phi$ over $\mathcal{A}$, we may conjoin a set of atoms
giving that information. In the remainder of the paper, we write $\mathrm{NaiveGnd}_{\mathcal{A}}(\phi)$
for the result of calling $\mathrm{NaiveGnd}_{\mathcal{A}}(\phi, \emptyset)$ and conjoining ground
atoms to it to produce a grounding of $\phi$ over $\mathcal{A}$. We may also produce a reduced
grounding from $\mathrm{NaiveGnd}_{\mathcal{A}}(\phi, \emptyset)$ by "evaluating out" all atoms of the
instance vocabulary. The groundings produced by algorithms described later in this paper can be
obtained by simplifying out certain sub-formulas of $\mathrm{NaiveGnd}_{\mathcal{A}}(\phi)$.
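
The recursion of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any grounder discussed here; the nested-tuple encoding of NNF formulas and the name `naive_gnd` are assumptions made for this example, and substitutions are represented as dictionaries from variable names to domain elements.

```python
# Illustrative encoding of NNF formulas as nested tuples (an assumption):
#   ("atom", "P", ("x", "y"))    atom P(x, y)
#   ("not",  "P", ("x",))        negated atom
#   ("and", (f1, f2, ...))       n-ary conjunction
#   ("or",  (f1, f2, ...))       n-ary disjunction
#   ("forall", "x", f), ("exists", "x", f)

def naive_gnd(phi, domain, theta=None):
    """Return the naive grounding of NNF formula phi over `domain`."""
    theta = theta or {}
    kind = phi[0]
    if kind in ("atom", "not"):
        _, pred, args = phi
        # Apply the substitution theta to the arguments of the (negated) atom.
        return (kind, pred, tuple(theta.get(v, v) for v in args))
    if kind in ("and", "or"):
        return (kind, tuple(naive_gnd(p, domain, theta) for p in phi[1]))
    if kind in ("forall", "exists"):
        _, var, body = phi
        # A universal becomes a conjunction over the domain,
        # an existential becomes a disjunction.
        conn = "and" if kind == "forall" else "or"
        return (conn, tuple(naive_gnd(body, domain, {**theta, var: a})
                            for a in domain))
    raise ValueError(f"unknown formula kind: {kind}")
```

The result contains one copy of the quantified subformula per domain element, which is why the size of the naive grounding is polynomial in the domain size for a fixed formula.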

2.2 Transformation to CNF and Unit Propagation 

To transform a ground formula to CNF, we employ the method of Tseitin [Tse68] with two modifications.
The method, usually presented for propositional formulas, involves adding a new atom corresponding to
each sub-formula. Here, we use a version for ground FO formulas, so the resulting CNF formula is also a
ground FO formula, over vocabulary $\tau = \sigma \cup \varepsilon \cup \tilde{A} \cup \omega$, where
$\omega$ is a set of new relation symbols which we call "Tseitin symbols". To be precise, $\omega$
consists of a new $k$-ary relation symbol $\lceil\psi\rceil$ for each subformula $\psi$ of $\phi$ with
$k$ free variables. We also formulate the transformation for formulas in which conjunction and
disjunction may have arbitrary arity.

Let $\gamma = \mathrm{NaiveGnd}_{\mathcal{A}}(\phi, \emptyset)$. Each subformula $\alpha$ of $\gamma$
is a grounding over $\mathcal{A}$ of a substitution instance $\psi(\bar{x})[\theta]$ of some subformula
$\psi$ of $\phi$ with free variables $\bar{x}$. To describe the CNF transformation, it is useful to
think of labelling the subformulas of $\gamma$ during grounding as follows. If $\alpha$ is a grounding
of formula $\psi(\bar{x})[\theta]$, label $\alpha$ with the ground atom
$\lceil\psi\rceil(\bar{x})[\theta]$. To minimize notation, we will denote this atom by $\hat{\alpha}$,
setting $\hat{\alpha}$ to $\alpha$ if $\alpha$ is an atom. Now, we have for each sub-formula $\alpha$
of the ground formula $\psi$ a unique ground atom $\hat{\alpha}$, and we carry out the Tseitin
transformation to CNF using these atoms.

Definition 3. For ground formula $\psi$, we denote by $\mathrm{CNF}(\psi)$ the following set of ground
clauses. For each sub-formula $\alpha$ of $\psi$ of form $(\land_i\, \alpha_i)$, include in
$\mathrm{CNF}(\psi)$ the set of clauses
$\{(\neg\hat{\alpha} \lor \hat{\alpha}_i)\} \cup \{(\lor_i \neg\hat{\alpha}_i \lor \hat{\alpha})\}$,
and similarly for the other connectives.

If $\psi$ is a grounding of $\phi$ over $\mathcal{A}$, then $\mathrm{CNF}(\psi)$ is also. The models of
$\psi$ are exactly the reducts of the models of $\mathrm{CNF}(\psi)$ defined by
$\sigma \cup \varepsilon \cup \tilde{A}$. $\mathrm{CNF}(\psi)$ can trivially be viewed as a
propositional CNF formula. This propositional formula can be sent to a SAT solver, and if a satisfying
assignment is found, a model of $\phi$ which is an expansion of $\mathcal{A}$ can be constructed from it.

Definition 4 ($\mathrm{UP}(\gamma)$). Let $\gamma$ be a ground FO formula in CNF. Define
$\mathrm{UP}(\gamma)$, the result of applying unit propagation to $\gamma$, to be the fixed point of
the following operation:

If $\gamma$ contains a unit clause $(l)$, delete from each clause of $\gamma$ every occurrence of
$\neg l$, and delete from $\gamma$ every clause containing $l$.

Now, $\mathrm{CNF}(\mathrm{NaiveGnd}_{\mathcal{A}}(\phi))$ is the result of producing the naive
grounding of $\phi$ over $\mathcal{A}$, and transforming it to CNF in the standard way, and
$\mathrm{UP}(\mathrm{CNF}(\mathrm{NaiveGnd}_{\mathcal{A}}(\phi)))$ is the formula obtained after
simplifying it by executing unit propagation. These two formulas provide reference points for measuring
the reduction in ground formula size obtained by LUP.
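
The operation of Definition 4 can be sketched as follows. This is a simple quadratic illustration rather than the watched-literal scheme used in real SAT solvers; the integer encoding of literals (a negative integer denotes a negated atom), the name `unit_propagate`, and the explicit conflict return value are assumptions made for this example.

```python
def unit_propagate(clauses):
    """Apply the operation of Definition 4 until a fixed point is reached.

    Literals are nonzero ints (negative = negated); clauses are iterables of
    literals. Returns (remaining_clauses, units), where `units` is the set of
    literals that appeared as unit clauses. If an empty clause is derived,
    returns (None, units) to signal inconsistency.
    """
    clauses = {frozenset(c) for c in clauses}
    units = set()
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, units
        (lit,) = unit
        units.add(lit)
        reduced = set()
        for c in clauses:
            if lit in c:
                continue            # clause contains l: delete it
            c = c - {-lit}          # delete every occurrence of not-l
            if not c:
                return None, units  # empty clause: inconsistent
            reduced.add(c)
        clauses = reduced
```

For instance, on the clauses `[[1], [-1, 2], [-2, 3, 4], [1, 5]]` the units 1 and 2 are propagated and only the clause `{3, 4}` remains.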



3 Bound Structures and Top-down Grounding 



We present grounding algorithms, in this section and in Section 5, which produce groundings of $\phi$
over a class of partial structures, which we call bound structures, related to $\mathcal{A}$. The
structure $\mathcal{LUP}(\phi, \mathcal{A})$ is a particular bound structure. In this section, we
define partial structures and bound structures, and then present a top-down grounding algorithm. The
formalization of bound structures here, and of $\mathcal{LUP}(\phi, \mathcal{A})$ in Section 4, are
ours, although a similar formalization was implicit in [DW08].

3.1 Partial Structures and Bound Structures 

A relational $\tau$-structure $\mathcal{A}$ consists of a domain $A$ together with a relation
$R^{\mathcal{A}} \subseteq A^k$ for each $k$-ary relation symbol $R$ of $\tau$. To talk about partial
structures, in which the interpretation of a relation symbol may be only partially defined, it is
convenient to view a structure in terms of the characteristic functions of the relations. A partial
$\tau$-structure $\mathcal{A}$ consists of a domain $A$ together with a $k$-ary function
$\chi_R^{\mathcal{A}} : A^k \to \{\top, \bot, \infty\}$, for each $k$-ary relation symbol $R$ of
$\tau$. Here, as elsewhere, $\top$ denotes true, $\bot$ denotes false, and $\infty$ denotes undefined.
If each of these characteristic functions is total, then $\mathcal{A}$ is total. We may sometimes abuse
terminology and call a relation partial, meaning the characteristic function interpreting the relation
symbol in question is partial.

Assume the natural adaptation of standard FO semantics to the case of partial relations, e.g., with
Kleene's 3-valued semantics [Kle52]. For any (total) $\tau$-structure $\mathcal{B}$, each
$\tau$-sentence $\phi$ is either true or false in $\mathcal{B}$ ($\mathcal{B} \models \phi$ or
$\mathcal{B} \not\models \phi$), and each $\tau$-formula $\phi(\bar{x})$ with free variables $\bar{x}$
defines a relation

$$\phi^{\mathcal{B}} = \{\bar{a} \in A^{|\bar{x}|} : \mathcal{B} \models \phi(\bar{x})[\bar{x}/\bar{a}]\}. \tag{1}$$

Similarly, for any partial $\tau$-structure $\mathcal{B}$, each $\tau$-sentence is either true, false
or undetermined in $\mathcal{B}$, and each $\tau$-formula $\phi(\bar{x})$ with free variables $\bar{x}$
defines a partial function

$$
\chi_\phi^{\mathcal{B}}(\bar{a}) =
\begin{cases}
\top & \text{if } \mathcal{B} \models \phi(\bar{x})[\bar{x}/\bar{a}] \\
\bot & \text{if } \mathcal{B} \models \neg\phi(\bar{x})[\bar{x}/\bar{a}] \\
\infty & \text{o.w.}
\end{cases}
\tag{2}
$$

There is a natural partial order on partial structures for any vocabulary $\tau$, which we may denote
by $\le$, where $\mathcal{A} \le \mathcal{B}$ iff $\mathcal{A}$ and $\mathcal{B}$ agree at all points
where they are both defined, and $\mathcal{B}$ is defined at every point $\mathcal{A}$ is. If
$\mathcal{A} \le \mathcal{B}$, we may say that $\mathcal{B}$ is a strengthening of $\mathcal{A}$. When
convenient, if the vocabulary of $\mathcal{A}$ is a proper subset of that of $\mathcal{B}$, we may
still call $\mathcal{B}$ a strengthening of $\mathcal{A}$, taking $\mathcal{A}$ to leave all symbols
not in its vocabulary completely undefined. We will call $\mathcal{B}$ a conservative strengthening of
$\mathcal{A}$ with respect to formula $\phi$ if $\mathcal{B}$ is a strengthening of $\mathcal{A}$ and
in addition every total structure which is a strengthening of $\mathcal{A}$ and a model of $\phi$ is
also a strengthening of $\mathcal{B}$. (Intuitively, we could ground $\phi$ over $\mathcal{B}$ instead
of $\mathcal{A}$, and not lose any intended models.)

The specific structures of interest are over a vocabulary expanding the vocabulary of $\phi$ in a
certain way. We will call a vocabulary $\tau$ a Tseitin vocabulary for $\phi$ if it contains, in
addition to the symbols of $\phi$, the set $\omega$ of Tseitin symbols for $\phi$. We call a
$\tau$-structure a "Tseitin structure for $\phi$" if the interpretations of the Tseitin symbols respect
the special role of those symbols in the Tseitin transformation. For example, if $\alpha$ is
$\alpha_1 \land \alpha_2$, then $\hat{\alpha}^{\mathcal{A}}$ must be true iff
$\hat{\alpha}_1^{\mathcal{A}} = \hat{\alpha}_2^{\mathcal{A}} = \text{true}$. The vocabulary of the
formula $\mathrm{CNF}(\mathrm{NaiveGnd}_{\mathcal{A}}(\phi))$ is a Tseitin vocabulary for $\phi$, and
every model of that formula is a Tseitin structure for $\phi$.

Definition 5 (Bound Structures). Let $\phi$ be a formula, and $\mathcal{A}$ be a structure for a subset
of the vocabulary of $\phi$. A bound structure for $\phi$ and $\mathcal{A}$ is a partial Tseitin
structure for $\phi$ that is a conservative strengthening of $\mathcal{A}$ with respect to $\phi$.

Intuitively, a bound structure provides a way to represent the information from the instance together
with additional information, including information about the Tseitin symbols in a grounding of $\phi$,
that we may derive (by any means), provided that information does not eliminate any intended models.

Let $\tau$ be the minimum vocabulary for bound structures for $\phi$ and $\mathcal{A}$. The bound
structures for $\phi$ and $\mathcal{A}$ with vocabulary $\tau$ form a lattice under the partial order
$\le$, with $\mathcal{A}$ the minimum element. The maximum element is defined exactly for the atoms of
$\mathrm{CNF}(\mathrm{NaiveGnd}_{\mathcal{A}}(\phi))$ which have the same truth value in every Tseitin
$\tau$-structure that satisfies $\phi$. This is the structure produced by "Most Optimum Propagator" in
[WMD10].



Definition 6 (Grounding over a bound structure). Let $\hat{\mathcal{A}}$ be a bound structure for
$\phi$ and $\mathcal{A}$. A formula $\psi$, over a Tseitin vocabulary for $\phi$ which includes
$\tilde{A}$, is a grounding of $\phi$ over $\hat{\mathcal{A}}$ iff

1. if there is a total strengthening of $\hat{\mathcal{A}}$ that satisfies $\phi$, then there is one
that satisfies $\psi$;

2. if $\mathcal{B}$ is a total Tseitin structure for $\phi$ which strengthens $\hat{\mathcal{A}}$,
gives $\tilde{A}$ the intended interpretation, and satisfies $\psi$, then it satisfies $\phi$.

A grounding $\psi$ of $\phi$ over $\hat{\mathcal{A}}$ need not be a grounding of $\phi$ over
$\mathcal{A}$. If we conjoin with $\psi$ ground atoms representing the information contained in
$\hat{\mathcal{A}}$, then we do obtain a grounding of $\phi$ over $\mathcal{A}$. In practice, we send
just $\mathrm{CNF}(\psi)$ to the SAT solver, and if a satisfying assignment is found, add the missing
information back in at the time we construct a model for $\phi$.



3.2 Top-down Grounding over a Bound Structure 

Algorithm 2 produces a grounding of $\phi$ over a bound structure $\hat{\mathcal{A}}$ for
$\mathcal{A}$. Gnd and Simpl are defined by mutual recursion. Gnd performs expansions and
substitutions, while Simpl performs lookups in $\hat{\mathcal{A}}$ to see if the grounding of a
sub-formula may be left out. Eval provides the base cases, evaluating ground atoms over
$\sigma \cup \varepsilon \cup \tilde{A} \cup \omega$ in $\hat{\mathcal{A}}$.



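Algorithm 2 Top-Down Grounding over Bound Structure $\hat{\mathcal{A}}$ for $\phi$ and $\mathcal{A}$

$$
\mathrm{Gnd}_{\hat{\mathcal{A}}}(\phi, \theta) =
\begin{cases}
\mathrm{Eval}_{\hat{\mathcal{A}}}(P, \theta) & \phi \text{ is an atom } P(\bar{x}) \\
\neg\mathrm{Eval}_{\hat{\mathcal{A}}}(P, \theta) & \phi \text{ is a negated atom } \neg P(\bar{x}) \\
\bigwedge_i \mathrm{Simpl}_{\hat{\mathcal{A}}}(\psi_i, \theta) & \phi = \bigwedge_i \psi_i \\
\bigvee_i \mathrm{Simpl}_{\hat{\mathcal{A}}}(\psi_i, \theta) & \phi = \bigvee_i \psi_i \\
\bigwedge_{a \in A} \mathrm{Simpl}_{\hat{\mathcal{A}}}(\psi, \theta \cup (x/\tilde{a})) & \phi = \forall x\, \psi \\
\bigvee_{a \in A} \mathrm{Simpl}_{\hat{\mathcal{A}}}(\psi, \theta \cup (x/\tilde{a})) & \phi = \exists x\, \psi
\end{cases}
$$

$$
\mathrm{Eval}_{\hat{\mathcal{A}}}(P, \theta) =
\begin{cases}
\top & \hat{\mathcal{A}} \models P[\theta] \\
\bot & \hat{\mathcal{A}} \models \neg P[\theta] \\
P(\bar{x})[\theta] & \text{o.w.}
\end{cases}
\qquad
\mathrm{Simpl}_{\hat{\mathcal{A}}}(\psi, \theta) =
\begin{cases}
\top & \hat{\mathcal{A}} \models \lceil\psi\rceil[\theta] \\
\bot & \hat{\mathcal{A}} \models \neg\lceil\psi\rceil[\theta] \\
\mathrm{Gnd}_{\hat{\mathcal{A}}}(\psi, \theta) & \text{o.w.}
\end{cases}
$$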




The stronger $\hat{\mathcal{A}}$ is, the smaller the ground formula produced by Algorithm 2. If we set
$\hat{\mathcal{A}}$ to be undefined everywhere (i.e., to just give the domain), then Algorithm 2
produces $\mathrm{NaiveGnd}_{\mathcal{A}}(\phi, \emptyset)$. If $\hat{\mathcal{A}}$ is set to
$\mathcal{A}$, we get the reduced grounding obtained by evaluating instance symbols out of
$\mathrm{NaiveGnd}_{\mathcal{A}}(\phi)$.

Proposition 1. Algorithm 2 produces a grounding of $\phi$ over $\hat{\mathcal{A}}$.
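
The mutual recursion between Gnd and Simpl can be sketched as below. This is a minimal illustration under stated assumptions: formulas use a nested-tuple encoding chosen for this example, the bound structure is abstracted as a hypothetical callable `lookup(psi, theta)` returning `True`, `False`, or `None` (undefined) for the Tseitin atom of subformula `psi` under substitution `theta`, and, like the algorithm as described, the sketch does not simplify truth constants that end up inside connectives.

```python
# Formulas (illustrative encoding): ("atom", P, args), ("not", P, args),
# ("and", (f1, ...)), ("or", (f1, ...)), ("forall", x, f), ("exists", x, f).

def eval_atom(phi, theta, lookup):
    """Eval: replace a (negated) atom by its truth value when it is known."""
    kind, pred, args = phi
    value = lookup(("atom", pred, args), theta)
    if value is None:
        return (kind, pred, tuple(theta.get(v, v) for v in args))
    return value if kind == "atom" else (not value)

def gnd(phi, theta, domain, lookup):
    """Gnd: expand connectives and quantifiers, delegating to simpl."""
    kind = phi[0]
    if kind in ("atom", "not"):
        return eval_atom(phi, theta, lookup)
    if kind in ("and", "or"):
        return (kind, tuple(simpl(p, theta, domain, lookup) for p in phi[1]))
    _, var, body = phi
    conn = "and" if kind == "forall" else "or"
    return (conn, tuple(simpl(body, {**theta, var: a}, domain, lookup)
                        for a in domain))

def simpl(psi, theta, domain, lookup):
    """Simpl: skip grounding psi when the bound structure already decides it."""
    value = lookup(psi, theta)
    return gnd(psi, theta, domain, lookup) if value is None else value
```

With a `lookup` that always returns `None` the sketch reproduces the naive grounding, mirroring the remark above about a bound structure that is undefined everywhere.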



3.3 Autarkies and Autark Subformulas 

In the literature, an autarky [MS85] is informally a "self-sufficient" model for some clauses which
does not affect the remaining clauses of the formula. An autark subformula is a subformula which is
satisfied by an autarky. To see how an autark subformula may be produced during grounding, let
$\lambda = \gamma_1 \lor \gamma_2$ and imagine that the value of subformula $\gamma_1$ is true
according to our bound structure. Then $\lambda$ will be true, regardless of the value of $\gamma_2$,
and the grounder will replace its subformula with its truth value, whereas in the case of naive
grounding, the grounder does not have that information during the grounding. So it generates the set
of clauses for this subformula as:
$\{(\neg\lambda \lor \gamma_1 \lor \gamma_2), (\neg\gamma_1 \lor \lambda), (\neg\gamma_2 \lor \lambda)\}$.
Now the propagation of the truth value of $\gamma_1$, and subsequently $\lambda$, results in
elimination of all three clauses, but the set of clauses generated for $\gamma_2$ will remain in the
CNF formula. We call $\gamma_2$ and the clauses made from that subformula autarkies.

The example suggests that this is a common phenomenon and that the number of autarkies might be quite
large in many groundings, as will be seen in Section 6.
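
The effect described in this example can be checked on a small instance. The sketch below is illustrative only: it assumes, hypothetically, that $\gamma_2$ is itself a disjunction $p \lor q$, encodes atoms as integers, and uses a bare-bones unit propagation loop (with no conflict handling) written for this example.

```python
def unit_propagate(clauses):
    """Minimal unit propagation: returns the clauses left at the fixed point."""
    clauses = {frozenset(c) for c in clauses}
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses
        (lit,) = unit
        # Drop clauses containing lit; remove the complement from the rest.
        clauses = {c - {-lit} for c in clauses if lit not in c}

# Hypothetical atom numbering for this example.
LAM, G1, G2, P, Q = 1, 2, 3, 4, 5

cnf = [
    [-LAM, G1, G2], [-G1, LAM], [-G2, LAM],  # Tseitin clauses for lam = g1 v g2
    [-G2, P, Q], [-P, G2], [-Q, G2],         # Tseitin clauses for g2 = p v q (assumed)
    [G1],                                    # g1 is known to be true
]

remaining = unit_propagate(cnf)
```

After propagation the three clauses for $\lambda$ are gone, but the three clauses that define $\gamma_2$ are still present in `remaining`, even though nothing in the rest of the formula depends on them any more.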

4 Lifted Unit Propagation Structures 

In this section we define $\mathcal{LUP}(\phi, \mathcal{A})$, and a method for constructing it.

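Definition 7 ($\mathcal{LUP}(\phi, \mathcal{A})$). Let Units denote the set of unit clauses that appear
during the execution of UP on $\mathrm{CNF}(\mathrm{NaiveGnd}_{\mathcal{A}}(\phi))$. The LUP structure
for $\phi$ and $\mathcal{A}$ is the unique bound structure for $\phi$ and $\mathcal{A}$ for which:

$$
\lceil\psi\rceil^{\mathcal{LUP}(\phi,\mathcal{A})}(\bar{a}) =
\begin{cases}
\top & \lceil\psi\rceil(\bar{a}) \in \mathit{Units} \\
\bot & \neg\lceil\psi\rceil(\bar{a}) \in \mathit{Units} \\
\infty & \text{o.w.}
\end{cases}
\tag{3}
$$

Since Algorithm 2 produces a grounding, according to Definition 6, for any bound structure, it
produces a grounding for $\phi$ over $\mathcal{LUP}(\phi, \mathcal{A})$.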

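To construct $\mathcal{LUP}(\phi, \mathcal{A})$, we use an inductive definition obtained from $\phi$.
In this inductive definition, we use distinct vocabulary symbols for the sets of tuples which
$\hat{\mathcal{A}}$ sets to true and false. The algorithm works based on the notion of True (False)
bounds:

Definition 8 (Formula-Bound). A True (resp. False) bound for a subformula $\psi(\bar{x})$ according to
bound structure $\hat{\mathcal{A}}$ is the relation denoted by $T_\psi$ (resp. $F_\psi$) such that:

1. $\bar{a} \in T_\psi \Leftrightarrow \lceil\psi\rceil^{\hat{\mathcal{A}}}(\bar{a}) = \top$
2. $\bar{a} \in F_\psi \Leftrightarrow \lceil\psi\rceil^{\hat{\mathcal{A}}}(\bar{a}) = \bot$

Naturally, when $\lceil\psi\rceil^{\hat{\mathcal{A}}}(\bar{a}) = \infty$, $\bar{a}$ is not contained in
either $T_\psi$ or $F_\psi$.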

The rules of the inductive definition are given in Table 1. These rules may be read as rules of
FO(ID), the extension of classical logic with inductive definitions under the well-founded semantics
[VGRS91, DT08], with free variables implicitly universally quantified. The type column indicates the
type of the subformula, and the rules columns identify the rule for this subformula. Given a
$\sigma$-structure $\mathcal{A}$, we may evaluate the definitions on $\mathcal{A}$, thus obtaining a
set of concrete bounds for the subformulas of $\phi$. The rules reflect the reasoning that UP can do.
For example, consider rule $(\lor_i \psi_i)$ of $\downarrow t$ for
$\gamma(\bar{x}) = \psi_1(\bar{x}_1) \lor \cdots \lor \psi_N(\bar{x}_N)$, and for some
$i \in \{1, \ldots, N\}$:

$$T_{\psi_i}(\bar{x}_i) \leftarrow T_\gamma(\bar{x}) \land \bigwedge_{j \ne i} F_{\psi_j}(\bar{x}_j).$$

This states that when a tuple $\bar{a}$ satisfies $\gamma$ but falsifies all disjuncts $\psi_j$ of
$\gamma$ except for one, namely $\psi_i$, then it must satisfy $\psi_i$. As a starting point, we know
the value of the instance predicates, and we also assume that $\phi$ is $\mathcal{A}$-satisfiable.

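Example 2. Let $\phi = \forall x\, \neg I_1(x) \lor E_1(x)$, $\sigma = \{I_1, I_2\}$, and
$\mathcal{A} = (\{1, 2, 3, 4\};\ I_1^{\mathcal{A}} = \{1\})$. The relevant rules from Table 1 are:

$$
\begin{aligned}
T_{I_1}(x) &\leftarrow I_1(x) \\
F_{\neg I_1(x)}(x) &\leftarrow T_{I_1}(x) \\
T_{E_1(x)}(x) &\leftarrow T_{\neg I_1(x) \lor E_1(x)}(x) \land F_{\neg I_1(x)}(x) \\
T_{E_1}(x) &\leftarrow T_{E_1(x)}(x)
\end{aligned}
$$

We find that $T_{E_1} = \{1\}$; in other words, $E_1(1)$ is true in each model of $\phi$ expanding
$\mathcal{A}$.

Note that this inductive definition is monotone, because $\phi$ is in Negation Normal Form (NNF).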
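
Because the definition is monotone, its bounds can be computed by iterating the rules until nothing changes. The sketch below does this for the four rules of Example 2 using plain (not semi-naive) iteration over Python sets. The bound names are labels chosen for this example, and the sketch additionally assumes that the true bound of the disjunction $\neg I_1(x) \lor E_1(x)$ is the whole domain, which follows from the assumption that $\phi$ is $\mathcal{A}$-satisfiable together with the $\downarrow t$ rule for the universal quantifier.

```python
def example2_bounds():
    """Least fixed point of the rules of Example 2, by naive iteration."""
    domain = {1, 2, 3, 4}
    I1 = {1}  # the instance relation I_1 in A
    bounds = {k: set() for k in ("T_I1", "F_notI1", "T_or", "T_E1x", "T_E1")}
    while True:
        new = {k: set(v) for k, v in bounds.items()}
        new["T_I1"] |= I1                                   # T_I1(x) <- I1(x)
        new["F_notI1"] |= bounds["T_I1"]                    # F_{not I1(x)}(x) <- T_I1(x)
        new["T_or"] |= domain                               # assumed: phi is true
        new["T_E1x"] |= bounds["T_or"] & bounds["F_notI1"]  # unit-propagation rule
        new["T_E1"] |= bounds["T_E1x"]                      # T_E1(x) <- T_{E1(x)}(x)
        if new == bounds:
            return bounds
        bounds = new
```

Each pass only adds tuples, so the loop terminates, and the resulting `T_E1` bound contains exactly the element 1, in agreement with the example.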
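Type                        $\downarrow f$ rules                                                                                   $\uparrow f$ rules
$(\lor_i \psi_i)$           $F_{\psi_i}(\bar{x}_i) \leftarrow F_\gamma(\bar{x})$, for each $i$                                       $F_\gamma(\bar{x}) \leftarrow \bigwedge_i F_{\psi_i}(\bar{x}_i)$
$(\land_i \psi_i)$          $F_{\psi_i}(\bar{x}_i) \leftarrow F_\gamma(\bar{x}) \land \bigwedge_{j \ne i} T_{\psi_j}(\bar{x}_j)$, for each $i$   $F_\gamma(\bar{x}) \leftarrow \bigvee_i F_{\psi_i}(\bar{x}_i)$
$\exists y\, \psi(\bar{x},y)$   $F_\psi(\bar{x},y) \leftarrow F_\gamma(\bar{x})$                                                         $F_\gamma(\bar{x}) \leftarrow \forall y\, F_\psi(\bar{x},y)$
$\forall y\, \psi(\bar{x},y)$   $F_\psi(\bar{x},y) \leftarrow F_\gamma(\bar{x}) \land \forall y' \ne y\ T_\psi(\bar{x},y')$              $F_\gamma(\bar{x}) \leftarrow \exists y\, F_\psi(\bar{x},y)$
$P(\bar{x})$                $F_P(\bar{x}) \leftarrow F_\gamma(\bar{x})$                                                              $F_\gamma(\bar{x}) \leftarrow F_P(\bar{x})$
$\neg P(\bar{x})$           $T_P(\bar{x}) \leftarrow F_\gamma(\bar{x})$                                                              $F_\gamma(\bar{x}) \leftarrow T_P(\bar{x})$

Table 1: Rules for Bounds Computation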



4.1 LUP Structure Computation 

Our method for constructing $\mathcal{LUP}(\phi, \mathcal{A})$ is given in Algorithm 3. Several lines
in the algorithm require explanation. In line 1, the $\downarrow f$ rules are omitted from the set of
constructed rules. Because $\phi$ is in NNF, the $\downarrow f$ rules do not contribute any information
to the set of bounds. To see this, observe that every $\downarrow f$ rule has an atom of the form
$F_\gamma(\bar{x})$ in its body. Intuitively, for one of these rules to contribute a defined bound,
certain information must have previously been obtained regarding bounds for its parent. It can be
shown, by induction, that, in every case, the information about a bound inferred by an application of
a $\downarrow f$ rule must have previously been inferred by a $\uparrow f$ rule. In line 2 of the
algorithm we compute bounds using only the two sets of rules, $\downarrow t$ and $\uparrow f$. This is
justified by the fact that applying $\{\uparrow t, \downarrow t, \uparrow f\}$ to a fixpoint has the
same effect as applying $\{\downarrow t, \uparrow f\}$ to a fixpoint and then applying the
$\uparrow t$ rules afterwards. So we postpone the execution of the $\uparrow t$ rules to line 7.

Line 3 checks for the case that the definition has no model, which is to say that the rules allow us to 
derive that some atom is both in the true bound and the false bound for some subformula. This happens 
exactly when UP applied to the naive grounding would detect inconsistency. 

Finally, in lines 6 and 7 we throw away the true bounds for all non-atomic subformulas, and then
compute new bounds by evaluating the $\uparrow t$ rules, taking already computed bounds (with true
bounds for non-atoms set to empty) as the initial bounds in the computation. To see why, observe that
the true bounds computed in line 2 are based on the assumption that $\phi$ is $\mathcal{A}$-satisfiable.
So $\lceil\phi\rceil$ is set to true, which stops the top-down bounded grounding algorithm of
Section 3.2 from producing a grounding for $\phi$. That is because the Simpl function, considering the
true bound for $\phi$, simply returns $\top$ instead of calling
$\mathrm{Gnd}_{\hat{\mathcal{A}}}(\cdot, \cdot)$ on subformulas of $\phi$. This also holds for all the
formulas with true bounds calculated this way, except for the atomic formulas. So, we delete these
true bounds based on the initial unjustified assumption, and then construct the correct true bounds by
application of the $\uparrow t$ rules, in line 7. This is the main reason for postponing the execution
of the $\uparrow t$ rules.

Algorithm 3 Computation of $\mathcal{LUP}(\phi, \mathcal{A})$
1: Construct the rules $\{\uparrow t, \downarrow t, \uparrow f\}$
2: Compute bounds by evaluating the inductive definition $\{\downarrow t, \uparrow f\}$
3: if Bounds are inconsistent then
4:     return "$\mathcal{A}$ has no solution"
5: end if
6: Throw away $T_\psi(\bar{x})$ for all non-atomic subformulas $\psi(\bar{x})$
7: Compute new bounds by evaluating the inductive definition $\{\uparrow t\}$
8: return LUP structure constructed from the computed bounds, according to Definition 8.

5 Bottom-up Grounding over Bound Structures 

The grounding algorithm we use in Enfragmo constructs a grounding by a bottom-up process that parallels
database query evaluation, based on an extension of the relational algebra. We give a rough sketch of
the method here; further details can be found in, e.g., [Moh04, PLTG07]. Given a structure (database)
$\mathcal{A}$, a boolean query is a formula $\phi$ over the vocabulary of $\mathcal{A}$, and query
answering is evaluating whether $\phi$ is true, i.e., $\mathcal{A} \models \phi$. In the context of
grounding, $\phi$ has some additional vocabulary beyond that of $\mathcal{A}$, and producing a reduced
grounding involves evaluating out the instance vocabulary, and producing a ground formula representing
the expansions of $\mathcal{A}$ for which $\phi$ is true.

For each sub-formula $\alpha(\bar{x})$ with free variables $\bar{x}$, we call the set of reduced
groundings for $\alpha$ under all possible ground instantiations of $\bar{x}$ an answer to
$\alpha(\bar{x})$. We represent answers with tables on which the extended algebra operates. An
$X$-relation, in databases, is a $k$-ary relation associated with a $k$-tuple of variables $X$,
representing a set of instantiations of the variables of $X$. Our grounding method uses extended
$X$-relations, in which each tuple $\bar{a}$ is associated with a formula. In particular, if
$\mathcal{R}$ is the answer to $\alpha(\bar{x})$, then $\mathcal{R}$ consists of the pairs
$(\bar{a}, \alpha(\bar{a}))$. Since a sentence has no free variables, the answer to a sentence $\phi$
is a zero-ary extended $X$-relation, containing a single pair $(\langle\rangle, \psi)$, associating the
empty tuple with formula $\psi$, which is a reduced grounding of $\phi$.

The relational algebra has operations corresponding to each connective and quantifier in FO:
complement (negation); join (conjunction); union (disjunction); projection (existential
quantification); division or quotient (universal quantification). Each generalizes to extended
$X$-relations. If $(\bar{a}, \alpha(\bar{a})) \in \mathcal{R}$ then we write
$\delta_{\mathcal{R}}(\bar{a}) = \alpha(\bar{a})$. For example, the join of extended $X$-relation
$\mathcal{R}$ and extended $Y$-relation $\mathcal{S}$ (both over domain $A$), denoted
$\mathcal{R} \bowtie \mathcal{S}$, is the extended $X \cup Y$-relation
$\{(\bar{a}, \psi) \mid \bar{a} : X \cup Y \to A,\ \bar{a}|_X \in \mathcal{R},\ \bar{a}|_Y \in \mathcal{S}, \text{ and } \psi = \delta_{\mathcal{R}}(\bar{a}|_X) \land \delta_{\mathcal{S}}(\bar{a}|_Y)\}$.
It is easy to show that, if $\mathcal{R}$ is an answer to $\alpha_1(\bar{x})$ and $\mathcal{S}$ is an
answer to $\alpha_2(\bar{y})$ (both wrt $\mathcal{A}$), then $\mathcal{R} \bowtie \mathcal{S}$ is an
answer to $\alpha_1(\bar{x}) \land \alpha_2(\bar{y})$. The analogous property holds for the other
operators.
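
The join of extended relations can be sketched with dictionaries that map tuples to formulas. This is a minimal nested-loop illustration, not the implementation used in Enfragmo; the representation (a tuple of variable names plus a dictionary, with formulas as strings and `True` standing for $\top$) and the names `conj` and `join` are assumptions made for this example. Tuples absent from a table are treated as implicitly associated with $\bot$.

```python
def conj(f, g):
    """Conjoin two formulas, simplifying away the constant True."""
    if f is True:
        return g
    if g is True:
        return f
    return f"{f} & {g}"

def join(r_vars, r, s_vars, s):
    """Join an extended X-relation r with an extended Y-relation s.

    r_vars / s_vars are tuples of variable names; r / s map value tuples
    to formulas. Returns (out_vars, table) over the union of the variables.
    """
    out_vars = tuple(r_vars) + tuple(v for v in s_vars if v not in r_vars)
    out = {}
    for a, fa in r.items():
        env = dict(zip(r_vars, a))
        for b, fb in s.items():
            # The two tuples must agree on every shared variable.
            if all(env.get(v, val) == val for v, val in zip(s_vars, b)):
                merged = {**env, **dict(zip(s_vars, b))}
                out[tuple(merged[v] for v in out_vars)] = conj(fa, fb)
    return out_vars, out
```

As a usage example, joining the answer to an instance atom $P(x, y, z)$ with the answers to expansion atoms $E(x, y)$ and $E(y, z)$ keeps only the tuples present in $P$ and attaches to each the conjunction of the two corresponding $E$ atoms.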

To ground with this algebra, we define the answer to atomic formula $P(\bar{x})$ as follows. If $P$ is an
instance predicate, the answer is the set of tuples $(\bar{a}, \top)$, for $\bar{a} \in P^{\mathcal{A}}$. If $P$
is an expansion predicate, the answer is the set of all tuples $(\bar{a}, P(\bar{a}))$, for $\bar{a}$ a tuple of
elements from the domain of $\mathcal{A}$. Then we apply the algebra inductively, bottom-up, on the structure of
the formula. At the top, we obtain the answer to $\phi$, which is a relation containing only the pair
$((), \psi)$, where $\psi$ is a reduced grounding of $\phi$ wrt $\mathcal{A}$.
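The two base cases can be sketched as follows. The function names, the dictionary representation (tuple mapped to formula string), and the small domain are assumptions made for illustration only.

```python
from itertools import product

TRUE = "T"  # placeholder for the formula "true" (an assumption of this sketch)


def answer_instance_atom(tuples):
    """Answer to P(x) for an instance predicate: each tuple of P^A paired with true."""
    return {tuple(t): TRUE for t in tuples}


def answer_expansion_atom(name, arity, domain):
    """Answer to P(x) for an expansion predicate: every domain tuple paired
    with the corresponding ground atom."""
    return {
        t: f"{name}({','.join(map(str, t))})"
        for t in product(domain, repeat=arity)
    }


p_answer = answer_instance_atom([(1, 2, 3), (3, 4, 5)])
e_answer = answer_expansion_atom("E", 2, [1, 2, 3])
```

Note that the answer for an expansion predicate of arity k has one row per k-tuple over the domain, which is where the size of naive groundings comes from.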

Example 3. Let $\sigma = \{P\}$ and $\varepsilon = \{E\}$, and let $\mathcal{A}$ be a $\sigma$-structure with
$P^{\mathcal{A}} = \{(1, 2, 3), (3, 4, 5)\}$. The following extended relation $\mathcal{R}$ is an answer to
$\phi_1 = P(x, y, z) \land E(x, y) \land E(y, z)$:



x   y   z   $\psi$
1   2   3   $E(1,2) \land E(2,3)$
3   4   5   $E(3,4) \land E(4,5)$



Observe that $\delta_{\mathcal{R}}(1, 2, 3) = E(1, 2) \land E(2, 3)$ is a reduced grounding of
$\phi_1[(1, 2, 3)] = P(1, 2, 3) \land E(1, 2) \land E(2, 3)$, and $\delta_{\mathcal{R}}(1, 1, 1) = \bot$ is a
reduced grounding of $\phi_1[(1, 1, 1)]$.
The following extended relation is an answer to $\phi_2 = \exists z\, \phi_1$:



x   y   $\psi$
1   2   $E(1,2) \land E(2,3)$
3   4   $E(3,4) \land E(4,5)$



Here, $E(1, 2) \land E(2, 3)$ is a reduced grounding of $\phi_2[(1, 2)]$. Finally, the following represents an
answer to $\phi_3 = \exists x \exists y\, \phi_2$, where the single formula is a reduced grounding of $\phi_3$.

$[E(1,2) \land E(2,3)] \lor [E(3,4) \land E(4,5)]$
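The projection steps of Example 3 can be replayed with a short sketch. The function names and the string representation of formulas are assumptions for illustration, and the answer to $\phi_1$ is entered by hand rather than computed. Simplification of disjuncts that are true is omitted for brevity.

```python
def disj(formulas):
    """Disjoin formulas; a single disjunct is returned unchanged."""
    if len(formulas) == 1:
        return formulas[0]
    return " | ".join(f"[{f}]" for f in formulas)


def project_out(vars_, rows, var):
    """Existentially quantify `var`: group rows on the remaining variables
    and disjoin the formulas within each group."""
    keep = [i for i, v in enumerate(vars_) if v != var]
    groups = {}
    for t, f in rows.items():
        groups.setdefault(tuple(t[i] for i in keep), []).append(f)
    new_vars = tuple(vars_[i] for i in keep)
    return new_vars, {k: disj(fs) for k, fs in groups.items()}


# Answer to phi_1 from Example 3, entered by hand.
vars1 = ("x", "y", "z")
rows1 = {(1, 2, 3): "E(1,2) & E(2,3)", (3, 4, 5): "E(3,4) & E(4,5)"}

# phi_2 = exists z. phi_1
vars2, rows2 = project_out(vars1, rows1, "z")
# phi_3 = exists x. exists y. phi_2
vars3, rows3 = project_out(*project_out(vars2, rows2, "x"), "y")
```

After the last step `vars3` is empty and `rows3` holds a single formula for the empty tuple, matching the zero-ary answer described for sentences.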



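To modify the algorithm to ground using $\mathcal{LUP}(\phi, \mathcal{A})$ we need only change the base case for
expansion predicates. To be precise, if $P$ is an expansion predicate we set the answer to $P(\bar{x})$ to the
set of pairs $(\bar{a}, \psi)$ such that:

$$
\psi =
\begin{cases}
P(\bar{a}) & \text{if } P^{\mathcal{LUP}(\phi,\mathcal{A})}(\bar{a}) = \infty \\
\top & \text{if } P^{\mathcal{LUP}(\phi,\mathcal{A})}(\bar{a}) = \top \\
\bot & \text{if } P^{\mathcal{LUP}(\phi,\mathcal{A})}(\bar{a}) = \bot
\end{cases}
$$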
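A minimal sketch of this modified base case follows. It assumes, purely for illustration, that the three-valued interpretation of an expansion predicate is available as a dictionary `known` from tuples to `True` or `False`, with missing tuples meaning "unknown"; the function name and the string formulas are likewise hypothetical.

```python
from itertools import product

TRUE, FALSE = "T", "F"  # placeholders for the formulas true and false


def answer_expansion_atom_lup(name, arity, domain, known):
    """Base case for an expansion predicate under a three-valued structure.

    `known` maps tuples to True/False where the (hypothetical) LUP structure
    has fixed a truth value; tuples missing from `known` are unknown.
    """
    answer = {}
    for t in product(domain, repeat=arity):
        if t not in known:
            # Still undetermined: keep the ground atom.
            answer[t] = f"{name}({','.join(map(str, t))})"
        elif known[t]:
            answer[t] = TRUE
        else:
            answer[t] = FALSE
    return answer


ans = answer_expansion_atom_lup("E", 2, [1, 2], {(1, 2): True, (2, 1): False})
```

Only the undetermined tuples contribute atoms to the grounding; the rest are constants that the algebra's simplifications can absorb.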

Observe that bottom-up grounding mimics the second phase of Algorithm 3, i.e., a bottom-up truth
propagation, except that it also propagates the falses. So, for bottom-up grounding, we can omit line 7 from
Algorithm 3.

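Proposition 2. Let $((), \psi)$ be the answer to sentence $\phi$ wrt $\mathcal{A}$ after LUP initialization, then:

$$Gnd_{\mathcal{LUP}(\phi,\mathcal{A})}(\phi, \emptyset) \equiv \psi$$

where $Gnd_{\mathcal{LUP}(\phi,\mathcal{A})}(\phi, \emptyset)$ is the result of top-down grounding (Algorithm 2)
of $\phi$ over LUP structure $\mathcal{LUP}(\phi, \mathcal{A})$.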

This bottom-up method uses only the reduct of $\mathcal{LUP}(\phi, \mathcal{A})$ defined by
$\sigma \cup \varepsilon \cup A$, not the entire LUP structure.

6 Experimental Evaluation of LUP 

In this section we present an empirical study of the effect of LUP on grounding size and on grounding and 
solving times. We also compare LUP with GWB in terms of these same measures. The implementation of 
LUP is within our bottom-up grounder Enfragmo, as described in this paper, and the implementation of 
GWB is in the top-down grounder GidL, which is described in [WMD08b, WMD10]. GidL has several
parameters to control the precision of the bounds computation. In our experiments we use the default
settings. We used MiniSat as the ground solver for Enfragmo. GidL produces an output specifically for
the ground solver MiniSat(ID), and together they form the IDP system [WMD08d].

We report data for instances of three problems: Latin Square Completion, Bounded Spanning Tree (BST)
and Sudoku. The instances are the latin_square.17068* instances of Normal Latin Square Completion, the
104_rand_45_250_* and 104_rand_35_250_* instances of BST, and the ASP contest 2009 instances of Sudoku
from the Asparagus repository. All experiments were run on a Dell Precision T3400 computer with
a quad-core 2.66GHz Intel Core 2 processor having 4MB cache and 8GB of RAM, running CentOS 5.5
with Linux kernel 2.6.18.

In Tables 2 and 4, columns headed "Literals" or "Clauses" give the number of literals or clauses in
the CNF formula produced by Enfragmo without LUP (our baseline), or these values for other grounding
methods expressed as a fraction of the baseline value. In Tables 3 and 5, all values are times in seconds.
All values given are means for the entire collection of instances. Variances are not given, because they are
very small. We split the instances of BST into two sets, based on the number of nodes (35 or 45), because
these two groups exhibit somewhat different behaviour, but within the groups variances are also small. In
all tables, the minimum (best) values for each row are in bold face type, to highlight the conditions which
gave best performance.

Table 2 compares the sizes of CNF formulas produced by Enfragmo without LUP (the baseline) with
the formulas obtained by running UP on the baseline formulas and by running Enfragmo with LUP. Clearly
LUP reduces the size at least as much as UP, and usually reduces the size much more, due to the removal
of autarkies.
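For reference, the kind of post-processing meant by "running UP on the baseline formulas" can be sketched as below. This is a textbook version of unit propagation on a clause list, not the code used in the experiments; the function name and the DIMACS-style integer encoding of literals are assumptions of the sketch. It does not detect autark subformulas.

```python
def unit_propagate(clauses):
    """Simplify a CNF formula by unit propagation.

    `clauses` is a list of clauses, each a list of non-zero ints
    (DIMACS-style literals). Returns (simplified_clauses, assignment),
    or (None, assignment) if an empty clause (a conflict) is derived.
    """
    clauses = [set(c) for c in clauses]
    assignment = {}
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return [sorted(c) for c in clauses], assignment
        lit = next(iter(unit))
        assignment[abs(lit)] = lit > 0
        remaining = []
        for c in clauses:
            if lit in c:
                continue                 # clause satisfied: drop it
            reduced = c - {-lit}         # falsified literal: remove it
            if not reduced:
                return None, assignment  # empty clause: conflict
            remaining.append(reduced)
        clauses = remaining


simplified, assignment = unit_propagate([[1], [-1, 2], [-2, 3, 4], [1, 5]])
```

In this small example the unit clause on variable 1 forces variable 2, and a single two-literal clause remains. Rescanning the clause list for each unit is quadratic; SAT solvers use watched literals instead.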

Total time for solving a problem instance is composed of grounding time and SAT solving time. Table 3
compares the grounding and SAT solving time with and without LUP bounds. It is evident that the SAT
solving time is always reduced with LUP. This reduction is due to the elimination of the unit clauses and
autark subformulas from the grounding. Autark subformula elimination also affects the time required to
convert the ground formula to CNF, which reduces the grounding time, but in some cases the overhead
               Enfragmo              Enfragmo+UP (ratio)   Enfragmo+LUP (ratio)
Problem        Literals   Clauses    Literals   Clauses    Literals   Clauses
Latin Square    7452400   2514100    0.07       0.07       0.07       0.07
BST 45         22924989   9061818    0.96       0.96       0.24       0.24
BST 35          8662215   3415697    0.95       0.96       0.37       0.37
Sudoku          2875122    981668    0.17       0.18       0.07       0.08

Table 2: Impact of LUP on the size of the grounding. The first two columns give the numbers of literals and
clauses in groundings produced by Enfragmo without LUP (the baseline). The other columns give these
measures for formulas produced by executing UP on the baseline groundings (Enfragmo+UP), and for
groundings produced by Enfragmo with LUP (Enfragmo+LUP), expressed as a fraction of the baseline values.





               Enfragmo                  Enfragmo with LUP         Speed Up Factor
Problem        Gnd    Solving   Total    Gnd    Solving   Total    Gnd     Solving   Total
Latin Square   0.89   1.39      2.28     3.27   0.34      3.61     -2.38   1.05      -1.33
BST 45         6.08   7.56      13.64    2      1.74      3.74     4.07    5.82      9.9
BST 35         2.13   2.14      4.27     1.07   0.46      1.53     1.06    1.68      2.74
Sudoku         0.46   1.12      1.59     2.08   0.26      2.34     -1.62   0.86      -0.76



Table 3: Impact of LUP on reduction in both grounding and (SAT) solving time. Grounding time here 
includes LUP computations and CNF generation. 



imposed by LUP computation may not be made up for by this reduction. As the table shows, when LUP
outperforms the normal grounding we get speed-ups of about a factor of 3 in total time, whereas when it
loses to normal grounding the slowdown is by a factor of about 1.5.
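These factors can be rechecked from the totals in Table 3. The sketch below is only arithmetic on the published means; the dictionary and function names are illustrative. It also computes differences in seconds, since the columns headed "Speed Up Factor" appear to hold the baseline time minus the LUP time rather than a ratio.

```python
# Mean times in seconds from Table 3: (grounding, solving, total).
baseline = {
    "Latin Square": (0.89, 1.39, 2.28),
    "BST 45": (6.08, 7.56, 13.64),
    "BST 35": (2.13, 2.14, 4.27),
    "Sudoku": (0.46, 1.12, 1.59),
}
with_lup = {
    "Latin Square": (3.27, 0.34, 3.61),
    "BST 45": (2.00, 1.74, 3.74),
    "BST 35": (1.07, 0.46, 1.53),
    "Sudoku": (2.08, 0.26, 2.34),
}


def total_ratio(problem):
    """Baseline total time divided by LUP total time (>1 means LUP is faster)."""
    return baseline[problem][2] / with_lup[problem][2]


def total_difference(problem):
    """Baseline total minus LUP total, in seconds (>0 means LUP is faster)."""
    return round(baseline[problem][2] - with_lup[problem][2], 2)


ratios = {p: round(total_ratio(p), 2) for p in baseline}
differences = {p: total_difference(p) for p in baseline}
```

The ratios come out at roughly 3.65 and 2.79 for the two BST sets, and at roughly 0.63 and 0.68 for Latin Square and Sudoku, i.e. slowdowns by factors of about 1.6 and 1.5. Small discrepancies against the printed differences (for example for Sudoku) are presumably due to rounding of the underlying means.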

Table 4 compares the size reductions obtained by LUP and by GWB in GidL. The output of GidL
contains clauses and rules. The rules are transformed to clauses in MiniSat(ID). The measures reported
here are after that transformation. LUP reduces the size much more than GWB in most cases. This
stems from the fact that GidL's bound computation does not aim for completeness wrt unit propagation.
This also affects the solving time, because the CNF formulas are much smaller with LUP, as shown in
Table 4. Table 5 shows that Enfragmo with LUP and MiniSat is always faster than GidL with MiniSat(ID)
with or without bounds, and it is in some cases faster than Enfragmo without LUP.

7 Discussion 

In the context of grounding-based problem solving, we have described a method we call lifted unit
propagation (LUP) for carrying out a process essentially equivalent to unit propagation before and during
grounding. Our experiments indicate that the method can substantially reduce grounding size, even more than
unit propagation itself, and sometimes reduce total solving time as well.





               Enfragmo (no LUP)     GidL (no bounds)      Enfragmo with LUP     GidL with bounds
Problem        Literals   Clauses    Literals   Clauses    Literals   Clauses    Literals   Clauses
Latin Square    7452400   2514100    0.74       0.84       0.07       0.07       0.59       0.61
BST 45         22924989   9061818    0.99       1.02       0.24       0.24       0.25       0.24
BST 35          8662215   3415697    1.01       1.04       0.37       0.37       0.39       0.39
Sudoku          2875122    981668    0.56       0.6        0.07       0.08       0.38       0.39



Table 4: Comparison between the effectiveness of LUP and GidL Bounds on reduction in grounding size. 
The columns under Enfragmo show the actual grounding size whereas the other columns show the ratio of 
the grounding size relative to that of Enfragmo (without LUP). 





               Enfragmo                 IDP                      Enfragmo+LUP             IDP (Bounds)
Problem        Gnd    Solving  Total    Gnd    Solving  Total    Gnd    Solving  Total    Gnd    Solving  Total
Latin Square   0.89   1.39     2.28     3      4.63     7.63     3.27   0.34     3.61     2.4    3.81     6.21
BST 45         6.08   7.56     13.64    7.25   20.84    28.09    2      1.74     3.74     1.14   4.45     5.59
BST 35         2.13   2.14     4.27     2.63   6.31     8.94     1.07   0.46     1.53     0.67   2.73     3.4
Sudoku         0.46   1.12     1.59     1.81   1.3      3.11     2.08   0.26     2.34     2.85   0.51     2.37



Table 5: Comparison of solving time for Enfragmo and IDP, with and without LUP/bounds. 



Our work was motivated by the results of [WMD08b, WMD10], which presented the method we have
referred to as GWB. In GWB, bounds on sub-formulas of the specification formula are computed without 
reference to an instance structure, and represented with FO formulas. The grounding algorithm evaluates 
instantiations of these bound formulas on the instance structure to determine that certain parts of the naive 
grounding may be left out. If the bound formulas exactly represent the information unit propagation can 
derive, then LUP and GWB are equivalent (though implemented differently). However, generally the GWB 
bounds are weaker than the LUP bounds, for two reasons. First, they must be weaker, because no FO 
formula can define the bounds obtainable with respect to an arbitrary instance structure. Second, to make 
the implementation in GidL efficient, the computation of the bounds is heuristically truncated. This led us 
to ask how much additional reduction in formula size might be obtained by the complete LUP method, and 
whether the LUP computation could be done fast enough for this extra reduction to be useful in practice. 

Our experiments with the Enfragmo and GidL grounders show that, at least for some kinds of problems 
and instances, using LUP can produce much smaller groundings than the GWB implementation in GidL. In 
our experiments, the total solving times for Enfragmo with ground solver MiniSat were always less than 
those of GidL with ground solver MiniSat(ID). However, LUP reduced total solving time of Enfragmo 
with MiniSat significantly in some cases, and increased it — albeit less significantly — in others. Since 
there are many possible improvements of the LUP implementation, the question of whether LUP can be 
implemented efficiently enough to be used all the time remains unanswered. 

Investigating more efficient ways to do LUP, such as by using better data structures, is a subject for
future work, as is consideration of other approximate methods, such as placing a heuristic time-out on the
LUP structure computation, or dovetailing of the LUP computation with grounding. We also observed that
much of the reduction in grounding size obtained by LUP is due to identification of autark sub-formulas.
These cannot be eliminated from the naive grounding by unit propagation. Further investigation of the
importance of these in practice is another direction we are pursuing. One more direction we are pursuing
is the study of methods for deriving even stronger information than that represented by the LUP structure, to
further reduce ground formula size, and possibly grounding time as well.